Setup¶
Install the Signal Ocean SDK:
!pip install signal-ocean
Set your subscription key, acquired here: https://apis.signalocean.com/profile
signal_ocean_api_key = ''  # replace with your subscription key
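Rather than pasting the key into the notebook, you can read it from an environment variable; a minimal sketch, where SIGNAL_OCEAN_API_KEY is an assumed variable name (not one the SDK requires):

```python
import os

def load_api_key(var_name: str = "SIGNAL_OCEAN_API_KEY") -> str:
    """Read the subscription key from the environment; warn if missing."""
    key = os.environ.get(var_name, "")
    if not key:
        print(f"Warning: {var_name} is not set; API calls will fail.")
    return key

signal_ocean_api_key = load_api_key()
```

This keeps the key out of shared notebooks and version control.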
Call the Voyages API¶
The Voyages API retrieves information about vessel voyages.
from signal_ocean import Connection
from signal_ocean.voyages import VoyagesAPI
from signal_ocean.voyages import Vessel, VesselFilter
from signal_ocean.voyages import VesselType, VesselTypeFilter
from signal_ocean.voyages import VesselClass, VesselClassFilter
import pandas as pd
from datetime import date, timedelta, datetime
import plotly.express as px
connection = Connection(signal_ocean_api_key)
api = VoyagesAPI(connection)
Get vessel class ids for Dry vessel classes
vessel_classes = api.get_vessel_classes()
vessel_classes_df = pd.DataFrame(v.__dict__ for v in vessel_classes)
vessel_classes_df[vessel_classes_df['vessel_type']=='Dry'].head(10)
| | vessel_class_id | vessel_class_name | vessel_type_id | vessel_type |
|---|---|---|---|---|
| 4 | 69 | VLOC | 3 | Dry |
| 5 | 70 | Capesize | 3 | Dry |
| 6 | 72 | Post Panamax | 3 | Dry |
| 7 | 74 | Panamax | 3 | Dry |
| 8 | 75 | Supramax | 3 | Dry |
| 9 | 76 | Handymax | 3 | Dry |
| 10 | 77 | Handysize | 3 | Dry |
| 19 | 92 | Small | 3 | Dry |
Get voyages for Capesize¶
For Capesize we will use vessel_class_id = 70 when calling the API.
voyages = api.get_voyages_condensed(vessel_class_id=70, date_from=datetime.strptime("2021-09-01", "%Y-%m-%d"))
voyages = pd.DataFrame([v.__dict__ for v in voyages])
voyages.tail(5)
| | imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18363 | 9966221 | 1 | 3 | 70 | 1 | NaN | False | None | I98128D46VEDC4A7000 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
| 18364 | 9974072 | 1 | 3 | 70 | 1 | NaN | False | None | I98313846VEDC4A7000 | 2 | ... | NaT | NaN | None | None | None | False | False | False | False | True |
| 18365 | 9974084 | 1 | 3 | 70 | 1 | NaN | False | None | I98314446VEDC22E300 | 2 | ... | 2023-09-19 21:56:32.408000+00:00 | 55.0 | China | None | Central China | True | False | False | False | False |
| 18366 | 9981958 | 1 | 3 | 70 | 1 | NaN | False | None | I98500646VEDA976100 | 1 | ... | 2023-06-27 11:41:04+00:00 | 248.0 | Viet Nam | None | Thailand / Vietnam | False | False | True | True | True |
| 18367 | 9981958 | 2 | 3 | 70 | 1 | NaN | False | None | I98500646VEDC22E300 | 2 | ... | 2023-09-11 21:07:57.071000+00:00 | 248.0 | Viet Nam | None | Thailand / Vietnam | False | False | True | False | True |
5 rows × 83 columns
Convert Timestamp to datetime, drop voyages with first_load_sailing_date in the future (ongoing or future voyages)
voyages['first_load_sailing_date'] = voyages['first_load_sailing_date'].apply(lambda d : pd.to_datetime(d).tz_localize(None))
voyages=voyages[voyages['first_load_sailing_date']<datetime.now()]
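The same conversion can be done column-wise with the `.dt` accessor instead of `apply`; a sketch on a toy frame (the dates are made up, only the column name matches the notebook):

```python
import pandas as pd
from datetime import datetime

# Toy stand-in for the voyages frame: one past and one future sailing date.
df = pd.DataFrame({
    "first_load_sailing_date": pd.to_datetime(
        ["2022-03-01T00:00:00+00:00", "2099-01-01T00:00:00+00:00"]
    )
})

# Strip the timezone for the whole column at once, then drop future rows.
df["first_load_sailing_date"] = df["first_load_sailing_date"].dt.tz_localize(None)
df = df[df["first_load_sailing_date"] < datetime.now()]
```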
Examine Australia Capesize Coal Exports¶
Group by origin and destination (first load, last discharge), and use first_load_sailing_date as date of export
coal_exports_country_level = voyages[(voyages['first_load_sailing_date']>"2022-01-01")&(voyages['cargo_group']=='Coal')] \
.set_index('first_load_sailing_date') \
.groupby(['first_load_country_name', 'last_discharge_country_name'])['quantity'] \
.resample('MS') \
.sum() \
.reset_index() \
.rename(columns={'first_load_sailing_date': 'Date'})
Drop intra country flows
coal_exports_country_level = coal_exports_country_level[coal_exports_country_level['first_load_country_name']!=coal_exports_country_level['last_discharge_country_name']]
coal_exports_country_level.head()
| | first_load_country_name | last_discharge_country_name | Date | quantity |
|---|---|---|---|---|
| 1 | Australia | Belgium | 2022-04-01 | 131000.0 |
| 2 | Australia | Brazil | 2022-01-01 | 307000.0 |
| 3 | Australia | Brazil | 2022-02-01 | 0.0 |
| 4 | Australia | Brazil | 2022-03-01 | 0.0 |
| 5 | Australia | Brazil | 2022-04-01 | 0.0 |
- Keep only Australia exports
- Aggregate all exports monthly
- Drop unnecessary columns
australia_coal_exports = coal_exports_country_level[(coal_exports_country_level['first_load_country_name']=='Australia')].copy()
australia_coal_exports['TotalExports'] = australia_coal_exports['quantity'].groupby(australia_coal_exports['Date']).transform('sum')
australia_coal_exports.drop_duplicates(subset=['Date'], keep='last', inplace = True)
australia_coal_exports.drop(['last_discharge_country_name', 'quantity'], axis=1, inplace = True)
australia_coal_exports.head()
| | first_load_country_name | Date | TotalExports |
|---|---|---|---|
| 306 | Australia | 2022-01-01 | 10931000.0 |
| 307 | Australia | 2022-02-01 | 10408000.0 |
| 308 | Australia | 2022-03-01 | 10080000.0 |
| 309 | Australia | 2022-04-01 | 8808000.0 |
| 310 | Australia | 2022-05-01 | 12162000.0 |
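The transform/drop_duplicates pair above can equally be written as a single groupby-sum; a sketch on hypothetical toy data mirroring the country-level frame:

```python
import pandas as pd

# Toy country-level flows standing in for coal_exports_country_level.
flows = pd.DataFrame({
    "first_load_country_name": ["Australia"] * 3,
    "last_discharge_country_name": ["China", "Japan", "China"],
    "Date": pd.to_datetime(["2022-01-01", "2022-01-01", "2022-02-01"]),
    "quantity": [100.0, 50.0, 70.0],
})

# One monthly total per (origin, month), with no intermediate duplicates to drop.
totals = (
    flows.groupby(["first_load_country_name", "Date"], as_index=False)["quantity"]
    .sum()
    .rename(columns={"quantity": "TotalExports"})
)
```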
Plot Australia Capesize Coal exports¶
australia_coal_exports['Year'] = australia_coal_exports['Date'].apply(lambda date : date.year)
australia_coal_exports['Month'] = australia_coal_exports['Date'].apply(lambda date : date.month)
australia_coal_exports.sort_values(['Month', 'Year'], inplace = True)
australia_coal_exports['Month'] = australia_coal_exports['Date'].apply(lambda date : date.strftime("%b"))
fig = px.histogram(australia_coal_exports, x="Month", y="TotalExports", color='Year',
barmode='group', height=500, width=1000,
title='Australia Capesize Coal Exports',
color_discrete_sequence=["lightgray", "gray", "lightblue"])
fig.show()
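The sort-by-numeric-month trick above works because Plotly keeps the row order; an ordered categorical makes the chronological axis explicit instead (a sketch with made-up values):

```python
import pandas as pd

# Month abbreviations deliberately out of order.
df = pd.DataFrame({"Month": ["Mar", "Jan", "Feb"], "TotalExports": [3.0, 1.0, 2.0]})

# An ordered categorical sorts "Jan" < "Feb" < ... < "Dec" chronologically,
# not alphabetically.
month_order = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
df["Month"] = pd.Categorical(df["Month"], categories=month_order, ordered=True)
df = df.sort_values("Month")
```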
Examine Australia to China Capesize Iron Ore Flows - Imports to China¶
Convert Timestamp to datetime, drop voyages with last_discharge_arrival_date in the future (ongoing or future voyages)
voyages['last_discharge_arrival_date'] = voyages['last_discharge_arrival_date'].apply(lambda d : pd.to_datetime(d).tz_localize(None))
voyages=voyages[voyages['last_discharge_arrival_date']<datetime.now()]
voyages.head()
| | imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9123180 | 69 | 3 | 70 | 1 | 832.0 | False | None | I8B356CVED8BCC500 | 1 | ... | 2021-10-04 19:54:56+00:00 | 112.0 | India | None | East Coast India | False | False | False | False | False |
| 1 | 9123180 | 70 | 3 | 70 | 1 | 832.0 | False | None | I8B356CVED8E45200 | 1 | ... | 2021-11-27 23:58:17+00:00 | 112.0 | India | None | East Coast India | False | False | False | False | False |
| 2 | 9123180 | 71 | 3 | 70 | 1 | 832.0 | False | None | I8B356CVED9336C00 | 1 | ... | 2022-01-20 23:56:47+00:00 | 112.0 | India | None | East Coast India | False | False | False | False | False |
| 3 | 9123180 | 72 | 3 | 70 | 1 | 832.0 | False | None | I8B356CVED95AF900 | 1 | ... | 2022-01-31 19:57:54+00:00 | 112.0 | India | None | East Coast India | False | False | False | False | True |
| 4 | 9123180 | 73 | 3 | 70 | 1 | 832.0 | False | None | I8B356CVED9828600 | 1 | ... | 2022-02-11 19:57:47+00:00 | 112.0 | India | None | East Coast India | False | False | False | False | True |
5 rows × 83 columns
- Keep only voyages with 'Iron Ore' as cargo group
- Group by origin and destination (first load, last discharge), and use last_discharge_arrival_date as date of flow
iron_ore_flows_country_level = voyages[(voyages['last_discharge_arrival_date']>"2022-01-01")&(voyages['cargo_group']=='Iron Ore')] \
.set_index('last_discharge_arrival_date') \
.groupby(['first_load_country_name', 'last_discharge_country_name'])['quantity'] \
.resample('MS') \
.sum() \
.reset_index() \
.rename(columns={'last_discharge_arrival_date': 'Date'})
iron_ore_flows_country_level.head()
| | first_load_country_name | last_discharge_country_name | Date | quantity |
|---|---|---|---|---|
| 0 | Australia | Australia | 2022-01-01 | 310000.0 |
| 1 | Australia | Australia | 2022-02-01 | 163000.0 |
| 2 | Australia | Australia | 2022-03-01 | 167000.0 |
| 3 | Australia | Australia | 2022-04-01 | 420000.0 |
| 4 | Australia | Australia | 2022-05-01 | 144000.0 |
Keep only flows between Australia and China
china_australia_iron_ore_flows = iron_ore_flows_country_level[(iron_ore_flows_country_level['first_load_country_name']=='Australia')&(iron_ore_flows_country_level['last_discharge_country_name']=='China')].copy()
china_australia_iron_ore_flows.head()
| | first_load_country_name | last_discharge_country_name | Date | quantity |
|---|---|---|---|---|
| 20 | Australia | China | 2022-01-01 | 53706000.0 |
| 21 | Australia | China | 2022-02-01 | 40645000.0 |
| 22 | Australia | China | 2022-03-01 | 47993000.0 |
| 23 | Australia | China | 2022-04-01 | 46727000.0 |
| 24 | Australia | China | 2022-05-01 | 51636000.0 |
Plot Australia to China Capesize Iron Ore Flows - Imports to China¶
china_australia_iron_ore_flows['Year'] = china_australia_iron_ore_flows['Date'].apply(lambda date : date.year)
china_australia_iron_ore_flows['Month'] = china_australia_iron_ore_flows['Date'].apply(lambda date : date.month)
china_australia_iron_ore_flows.sort_values(['Month', 'Year'], inplace = True)
china_australia_iron_ore_flows['Month'] = china_australia_iron_ore_flows['Date'].apply(lambda date : date.strftime("%b"))
fig = px.histogram(china_australia_iron_ore_flows, x="Month", y="quantity", color='Year',
barmode='group', height=500, width=1000,
title='Australia to China Capesize Iron Ore Flows - Imports to China',
color_discrete_sequence=["lightgray", "gray", "lightblue"])
fig.show()
Build your custom Flows - Brazil Grain Exports to specific countries by Panamax and Supramax¶
Get vessel class ids for Dry vessel classes
vessel_classes = api.get_vessel_classes()
vessel_classes_df = pd.DataFrame(v.__dict__ for v in vessel_classes)
vessel_classes_df[vessel_classes_df['vessel_type']=='Dry'].head(10)
| | vessel_class_id | vessel_class_name | vessel_type_id | vessel_type |
|---|---|---|---|---|
| 4 | 69 | VLOC | 3 | Dry |
| 5 | 70 | Capesize | 3 | Dry |
| 6 | 72 | Post Panamax | 3 | Dry |
| 7 | 74 | Panamax | 3 | Dry |
| 8 | 75 | Supramax | 3 | Dry |
| 9 | 76 | Handymax | 3 | Dry |
| 10 | 77 | Handysize | 3 | Dry |
| 19 | 92 | Small | 3 | Dry |
Select Vessel Classes of interest¶
Get voyages for selected vessel classes (Panamax and Supramax) starting from a given date
vessel_class_ids = (74, 75) # Get only voyages for Panamax and Supramax
date_from = datetime.strptime("2021-09-01", "%Y-%m-%d")
voyages = []
for vessel_class_id in vessel_class_ids:
voyages.append(api.get_voyages_condensed(vessel_class_id=vessel_class_id, date_from=date_from))
voyages = pd.DataFrame([v.__dict__ for vc in voyages for v in vc])
voyages.tail(5)
| | imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 94352 | 9978755 | 1 | 3 | 75 | 1 | NaN | False | None | I9843834BVEDC71FD00 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
| 94353 | 9979199 | 1 | 3 | 75 | 1 | NaN | False | None | I98453F4BVEDBFB5600 | 2 | ... | 2023-09-30 02:20:49.132000+00:00 | 56.0 | Colombia | None | West Coast South America | True | False | False | False | False |
| 94354 | 9979204 | 1 | 3 | 75 | 1 | NaN | False | None | I9845444BVEDC4A7000 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
| 94355 | 9982055 | 1 | 3 | 75 | 1 | NaN | False | None | I9850674BVEDC22E300 | 2 | ... | 2023-10-03 18:08:51.652000+00:00 | 121.0 | Japan | None | Japan Island | True | False | False | False | False |
| 94356 | 9982067 | 1 | 3 | 75 | 1 | NaN | False | None | I9850734BVEDC4A7000 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
5 rows × 83 columns
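The per-class loop above can also be expressed by concatenating one frame per class; a sketch where fetch_voyages_df is a hypothetical stand-in for the api.get_voyages_condensed call:

```python
import pandas as pd

def fetch_voyages_df(vessel_class_id: int) -> pd.DataFrame:
    # Hypothetical stand-in for api.get_voyages_condensed(...): returns toy rows.
    return pd.DataFrame({"vessel_class_id": [vessel_class_id] * 2})

vessel_class_ids = (74, 75)  # Panamax and Supramax
voyages = pd.concat(
    (fetch_voyages_df(c) for c in vessel_class_ids), ignore_index=True
)
```

ignore_index=True gives the combined frame a clean 0..n-1 index, matching the double-comprehension approach.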
Date of flows will be based on the export date
- Convert Timestamp to datetime, drop voyages with first_load_sailing_date in the future (ongoing or future)
voyages['first_load_sailing_date'] = voyages['first_load_sailing_date'].apply(lambda d : pd.to_datetime(d).tz_localize(None))
voyages=voyages[voyages['first_load_sailing_date']<datetime.now()]
voyages.head()
| | imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7514684 | 309 | 3 | 74 | 1 | 3968.0 | False | None | I72AA3CVED8BCC500 | 1 | ... | 2021-09-12 15:57:31+00:00 | 45.0 | Canada | None | Great Lakes | False | False | False | False | False |
| 1 | 7514684 | 310 | 3 | 74 | 1 | 3968.0 | False | None | I72AA3CVED8C9F400 | 1 | ... | 2021-09-28 11:58:58+00:00 | 240.0 | United States | None | Great Lakes | False | False | False | False | True |
| 2 | 7514684 | 311 | 3 | 74 | 1 | 3968.0 | False | None | I72AA3CVED8E45200 | 1 | ... | 2021-10-06 07:56:49+00:00 | 240.0 | United States | None | Great Lakes | False | False | False | False | True |
| 3 | 7514684 | 312 | 3 | 74 | 1 | 3968.0 | False | None | I72AA3CVED8EAE980 | 1 | ... | 2021-10-14 03:57:28+00:00 | 240.0 | United States | None | Great Lakes | False | False | False | False | True |
| 4 | 7514684 | 313 | 3 | 74 | 1 | 3968.0 | False | None | I72AA3CVED8F18100 | 1 | ... | 2021-10-20 19:56:28+00:00 | 240.0 | United States | None | Great Lakes | False | False | False | False | True |
5 rows × 83 columns
Select cargo types of interest¶
Examine different cargo types under the "Grains" cargo group
voyages[(voyages['cargo_group']=='Grains')&(voyages['cargo_type']!='Grains')].drop_duplicates(subset=['cargo_group', 'cargo_type'], keep='last', inplace = False)[['cargo_group', 'cargo_type']].head(10)
| | cargo_group | cargo_type |
|---|---|---|
| 93802 | Grains | Sorghum |
| 94096 | Grains | Soybeans |
| 94170 | Grains | Rice |
| 94211 | Grains | Barley |
| 94336 | Grains | Corn |
| 94353 | Grains | Wheat |
voyages[(voyages['first_load_country_name']=='Brazil')&(voyages['cargo_group']=='Grains')]['cargo_type'].value_counts().head()
Soybeans    2565
Corn         997
Wheat         51
Barley        20
Rice           9
Name: cargo_type, dtype: int64
- Keep only voyages with 'Soybeans' and 'Corn' as cargo type
- Group by origin and destination (first load, last discharge) and cargo type
- Use first_load_sailing_date as date of flow
cargo_types_of_interest = ('Soybeans', 'Corn')
custom_flows_country_level = voyages[(voyages['first_load_sailing_date']>"2022-01-01")&(voyages['cargo_type'].isin(cargo_types_of_interest))] \
.set_index('first_load_sailing_date') \
.groupby(['first_load_country_name', 'last_discharge_country_name', 'cargo_type'])['quantity'] \
.resample('MS') \
.sum() \
.reset_index() \
.rename(columns={'first_load_sailing_date': 'Date'})
custom_flows_country_level.head()
| | first_load_country_name | last_discharge_country_name | cargo_type | Date | quantity |
|---|---|---|---|---|---|
| 0 | Albania | Belgium | Corn | 2022-05-01 | 52000.0 |
| 1 | Algeria | Algeria | Corn | 2022-04-01 | 29000.0 |
| 2 | Algeria | Belgium | Corn | 2022-02-01 | 47000.0 |
| 3 | Algeria | Belgium | Corn | 2022-03-01 | 0.0 |
| 4 | Algeria | Belgium | Corn | 2022-04-01 | 46000.0 |
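pd.Grouper can replace the set_index/resample pair in the aggregation above; a sketch with toy rows (only the column names match the notebook; note that resample also emits zero-filled rows for empty months within a group, which Grouper may handle differently):

```python
import pandas as pd

# Toy voyage rows; columns mirror the notebook's names, values are made up.
v = pd.DataFrame({
    "first_load_country_name": ["Brazil", "Brazil", "Brazil"],
    "last_discharge_country_name": ["China", "China", "Japan"],
    "cargo_type": ["Corn", "Corn", "Soybeans"],
    "first_load_sailing_date": pd.to_datetime(
        ["2022-01-05", "2022-01-20", "2022-02-10"]
    ),
    "quantity": [50.0, 60.0, 30.0],
})

# Group the date column into month-start buckets alongside the categorical keys.
flows = (
    v.groupby(
        ["first_load_country_name", "last_discharge_country_name", "cargo_type",
         pd.Grouper(key="first_load_sailing_date", freq="MS")]
    )["quantity"]
    .sum()
    .reset_index()
    .rename(columns={"first_load_sailing_date": "Date"})
)
```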
Select flows of interest¶
Keep only flows between Brazil and (China, South Korea, Japan)
import_countries = ('China', 'Korea, Republic of', 'Japan')
custom_flows = custom_flows_country_level[(custom_flows_country_level['first_load_country_name']=='Brazil')&(custom_flows_country_level['last_discharge_country_name'].isin(import_countries))].copy()
custom_flows.head()
| | first_load_country_name | last_discharge_country_name | cargo_type | Date | quantity |
|---|---|---|---|---|---|
| 904 | Brazil | China | Corn | 2022-01-01 | 51000.0 |
| 905 | Brazil | China | Corn | 2022-02-01 | 61000.0 |
| 906 | Brazil | China | Corn | 2022-03-01 | 358000.0 |
| 907 | Brazil | China | Corn | 2022-04-01 | 189000.0 |
| 908 | Brazil | China | Corn | 2022-05-01 | 107000.0 |
Plot custom flows¶
custom_flows['Year'] = custom_flows['Date'].apply(lambda date : date.year)
custom_flows['Month'] = custom_flows['Date'].apply(lambda date : date.month)
custom_flows.sort_values(['Year', 'Month'], inplace = True)
custom_flows['Month'] = custom_flows['Date'].apply(lambda date : date.strftime("%b"))
custom_flows['Year'] = custom_flows['Date'].apply(lambda date : date.strftime("%Y"))
custom_flows['MonthYear'] = custom_flows['Month'].astype(str) + ' ' + custom_flows['Year'].astype(str)
fig = px.histogram(custom_flows, x="MonthYear", y="quantity", color='cargo_type',
                   height=500, width=1000, title='Custom Flows - Brazil Grain Exports to (China, South Korea, Japan)',
color_discrete_sequence=px.colors.qualitative.Pastel)
fig.update_layout(bargap=0.2)
fig.show()